Computational Feature-Sensitive Reconstruction of Language Relationships: Developing the ALINE Distance for Comparative Historical Linguistic Reconstruction

Authors

  • Sean S. Downey
  • Brian Hallmark
  • Murray P. Cox
  • Peter Norquest
  • J. Stephen Lansing
Abstract

Historical relationships among languages are used as a proxy for social history in many non-linguistic settings, including the fields of cultural and molecular anthropology. Linguists have traditionally assembled this information using the standard comparative method. While providing extremely nuanced linguistic information, this approach is time-consuming and labor-intensive. Conversely, computational approaches are appreciably quicker, but can potentially introduce significant error. Furthermore, current methods often use cognate sets that were themselves coded by historical linguists, thus reducing the benefit of computational approaches. Here we develop a method, based on the ALINE distance, to extract feature-sensitive relationships from paired glosses, datasets that require minimal contribution from trained linguists beyond transcription from primary sources. We validate our results by comparison with data generated independently via the comparative method, and quantify error rates using consistency indices. To showcase our method’s utility and to demonstrate its robustness at local and regional scales, we apply it to two language datasets from eastern Indonesia. As linguistic datasets proliferate, scalable computational methods that mimic historical linguistic reconstruction will become increasingly necessary. Although at present we cannot disentangle all the processes driving linguistic change (e.g. lexical borrowing), our method provides a robust and accurate alternative to manual linguistic analysis. The feature-sensitive method adopted here accurately and automatically identifies emergent patterns hidden in...

Address correspondence to: J. Stephen Lansing, Department of Anthropology, University of Arizona, Tucson, AZ 85721, United States of America.

Journal of Quantitative Linguistics 2008, Volume 15, Number 4, pp. 340–369. DOI: 10.1080/09296170802326681

Related resources

Can Corpus Based Measures be Used for Comparative Study of Languages?

Quantitative measurement of inter-language distance is a useful technique for studying diachronic and synchronic relations between languages. Such measures have been used successfully for purposes like deriving language taxonomies and language reconstruction, but they have mostly been applied to handcrafted word lists. Can we instead use corpus based measures for comparative study of languages?...


Quantitative methods for Phylogenetic Inference in Historical Linguistics: An experimental case study of South Central Dravidian

In this paper we examine the usefulness of two classes of algorithms, Distance Methods and Discrete Character Methods (Felsenstein and Felsenstein 2003), widely used in genetics, for predicting the family relationships among a set of related languages and, therefore, diachronic language change. Applying these algorithms to the data on the numbers of shared cognates-with-change and changed as well as u...


Collaborative Output Tasks and their Effects on Learning English Comparative Adjectives

This study aimed to examine the effect of two types of collaborative output tasks on Iranian EFL learners’ learning of comparative adjectives with two or more syllables. Thirty Iranian EFL learners participated in this study and were divided into two experimental groups and one control group; one experimental group received a dictogloss task in 4-pairs and the other experimental group was given text reconst...


An interview on linguistic variation with

Giuseppe Longobardi is Anniversary Professor of Linguistics in the Department of Language and Linguistic Science of the University of York. He has worked in theoretical and comparative syntax, with a special interest in the study of the structural correspondence of meaning. The structure of nominal expressions and the syntax of negation and negative quantifiers are two of the most prominent top...


Journal title:
  • Journal of Quantitative Linguistics

Volume 15, Issue 4

Pages 340–369

Publication date: 2008